Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 317890 |
| Missing cells | 48964 |
| Missing cells (%) | 1.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 34.0 MiB |
| Average record size in memory | 112.0 B |
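The missing-cell percentage follows directly from the counts in the table above. A minimal sketch of that arithmetic, with illustrative variable names that are not part of the report:

```python
# Counts copied from the "Dataset statistics" table above.
n_variables = 14
n_observations = 317_890
missing_cells = 48_964

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of all cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(f"total cells:   {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

All missing cells sit in a single column (GEMIDDELDE_VERKOOPPRIJS), which is why the dataset-level figure is much lower than that column's own missing rate.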
Variable types
| Categorical | 4 |
|---|---|
| DateTime | 1 |
| Numeric | 9 |
Alerts
| Alert | Type |
|---|---|
| VERSIE has constant value "1.0" | Constant |
| DATUM_BESTAND has constant value "2022-11-21" | Constant |
| PEILDATUM has constant value "2022-11-01" | Constant |
| TYPERENDE_DIAGNOSE_CD has a high cardinality: 1895 distinct values | High cardinality |
| BEHANDELEND_SPECIALISME_CD is highly overall correlated with AANTAL_PAT_PER_SPC | High correlation |
| AANTAL_PAT_PER_ZPD is highly overall correlated with AANTAL_SUBTRAJECT_PER_ZPD | High correlation |
| AANTAL_SUBTRAJECT_PER_ZPD is highly overall correlated with AANTAL_PAT_PER_ZPD | High correlation |
| AANTAL_PAT_PER_DIAG is highly overall correlated with AANTAL_SUBTRAJECT_PER_DIAG | High correlation |
| AANTAL_SUBTRAJECT_PER_DIAG is highly overall correlated with AANTAL_PAT_PER_DIAG | High correlation |
| AANTAL_PAT_PER_SPC is highly overall correlated with AANTAL_SUBTRAJECT_PER_SPC | High correlation |
| AANTAL_SUBTRAJECT_PER_SPC is highly overall correlated with AANTAL_PAT_PER_SPC | High correlation |
| GEMIDDELDE_VERKOOPPRIJS has 48964 (15.4%) missing values | Missing |
| AANTAL_SUBTRAJECT_PER_ZPD is highly skewed (γ1 = 21.3579137) | Skewed |
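To give a feel for what a large skewness coefficient γ1 means, the sketch below computes the population moment coefficient of skewness (third central moment divided by the 1.5th power of the second). This is an assumption about the general definition only: the report itself may use a bias-corrected estimator, and the sample data here is hypothetical.

```python
from statistics import fmean


def skewness(values):
    """Population moment coefficient of skewness: m3 / m2**1.5."""
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])  # second central moment
    m3 = fmean([(v - mean) ** 3 for v in values])  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical samples, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]
long_tail = [1, 1, 1, 2, 3, 14, 103, 1739]  # one very large value

print(skewness(symmetric))  # symmetric data has zero skewness
print(skewness(long_tail))  # a long right tail gives a positive value
```

Count columns in which most rows hold small values and a few rows hold very large ones behave like the second sample, only far more extreme.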
Reproduction
| Analysis started | 2022-11-22 16:04:36.416574 |
|---|---|
| Analysis finished | 2022-11-22 16:05:04.358308 |
| Duration | 27.94 seconds |
| Software version | pandas-profiling vdev |
VERSIE
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.4 MiB |
| 1.0 |
|---|
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 953670 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 317890 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 317890 | |
| . | 317890 | |
| 0 | 317890 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 635780 | |
| Other Punctuation | 317890 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 317890 | |
| 0 | 317890 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 317890 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 953670 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 317890 | |
| . | 317890 | |
| 0 | 317890 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 953670 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 317890 | |
| . | 317890 | |
| 0 | 317890 |
DATUM_BESTAND
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.4 MiB |
| 2022-11-21 |
|---|
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 3178900 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2022-11-21 |
|---|---|
| 2nd row | 2022-11-21 |
| 3rd row | 2022-11-21 |
| 4th row | 2022-11-21 |
| 5th row | 2022-11-21 |
Common Values
| Value | Count | Frequency (%) |
| 2022-11-21 | 317890 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 1271560 | |
| 1 | 953670 | |
| - | 635780 | |
| 0 | 317890 | 10.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2543120 | |
| Dash Punctuation | 635780 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1271560 | |
| 1 | 953670 | |
| 0 | 317890 | 12.5% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 635780 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3178900 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 1271560 | |
| 1 | 953670 | |
| - | 635780 | |
| 0 | 317890 | 10.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3178900 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 1271560 | |
| 1 | 953670 | |
| - | 635780 | |
| 0 | 317890 | 10.0% |
PEILDATUM
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.4 MiB |
| 2022-11-01 |
|---|
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 3178900 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2022-11-01 |
|---|---|
| 2nd row | 2022-11-01 |
| 3rd row | 2022-11-01 |
| 4th row | 2022-11-01 |
| 5th row | 2022-11-01 |
Common Values
| Value | Count | Frequency (%) |
| 2022-11-01 | 317890 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 953670 | |
| 1 | 953670 | |
| 0 | 635780 | |
| - | 635780 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2543120 | |
| Dash Punctuation | 635780 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 953670 | |
| 1 | 953670 | |
| 0 | 635780 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 635780 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3178900 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 953670 | |
| 1 | 953670 | |
| 0 | 635780 | |
| - | 635780 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3178900 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 953670 | |
| 1 | 953670 | |
| 0 | 635780 | |
| - | 635780 |
JAAR
Date
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.4 MiB |
| Minimum | 2012-01-01 00:00:00 |
|---|---|
| Maximum | 2022-01-01 00:00:00 |
BEHANDELEND_SPECIALISME_CD
Real number (ℝ)
| Distinct | 28 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 437.52328 |
| Minimum | 301 |
|---|---|
| Maximum | 8418 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 301 |
|---|---|
| 5-th percentile | 302 |
| Q1 | 305 |
| median | 313 |
| Q3 | 322 |
| 95-th percentile | 335 |
| Maximum | 8418 |
| Range | 8117 |
| Interquartile range (IQR) | 17 |
Descriptive statistics
| Standard deviation | 986.54181 |
|---|---|
| Coefficient of variation (CV) | 2.2548327 |
| Kurtosis | 61.331728 |
| Mean | 437.52328 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 7.9525306 |
| Sum | 1.3908428 × 10⁸ |
| Variance | 973264.75 |
| Monotonicity | Not monotonic |
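Several of the statistics above are derived from one another. The sketch below re-derives the range, interquartile range, coefficient of variation and variance from the reported minimum, maximum, quartiles, mean and standard deviation. The variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the BEHANDELEND_SPECIALISME_CD tables above.
minimum, maximum = 301, 8418
q1, q3 = 305, 322
mean = 437.52328
std = 986.54181

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance is the squared standard deviation

print(value_range, iqr)
print(round(cv, 7), round(variance, 2))
```

A CV well above 1, as here, reflects the few very large codes (1900, 8416, 8418) pulling the mean and standard deviation far above the median of 313.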
| Value | Count | Frequency (%) |
| 305 | 44866 | |
| 313 | 41243 | |
| 303 | 36603 | |
| 330 | 25186 | 7.9% |
| 316 | 21651 | 6.8% |
| 308 | 17059 | 5.4% |
| 306 | 13337 | 4.2% |
| 324 | 13108 | 4.1% |
| 301 | 12777 | 4.0% |
| 304 | 10340 | 3.3% |
| Other values (18) | 81720 |
| Value | Count | Frequency (%) |
| 301 | 12777 | 4.0% |
| 302 | 6969 | 2.2% |
| 303 | 36603 | |
| 304 | 10340 | 3.3% |
| 305 | 44866 | |
| 306 | 13337 | 4.2% |
| 307 | 5548 | 1.7% |
| 308 | 17059 | 5.4% |
| 310 | 3492 | 1.1% |
| 313 | 41243 |
| Value | Count | Frequency (%) |
| 8418 | 4247 | 1.3% |
| 8416 | 529 | 0.2% |
| 1900 | 210 | 0.1% |
| 390 | 862 | 0.3% |
| 389 | 3351 | 1.1% |
| 362 | 4287 | 1.3% |
| 361 | 2279 | 0.7% |
| 335 | 3220 | 1.0% |
| 330 | 25186 | |
| 329 | 834 | 0.3% |
TYPERENDE_DIAGNOSE_CD
Categorical
| Distinct | 1895 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.4 MiB |
| 101 | 1342 |
|---|---|
| 402 | 1301 |
| 403 | 1274 |
| 301 | 1273 |
| 201 | 1197 |
| Other values (1890) |
Length
| Max length | 4 |
|---|---|
| Median length | 3 |
| Mean length | 3.3520054 |
| Min length | 2 |
Characters and Unicode
| Total characters | 1065569 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 22 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 14 |
|---|---|
| 2nd row | 07 |
| 3rd row | 15 |
| 4th row | 09 |
| 5th row | 10 |
Common Values
| Value | Count | Frequency (%) |
| 101 | 1342 | 0.4% |
| 402 | 1301 | 0.4% |
| 403 | 1274 | 0.4% |
| 301 | 1273 | 0.4% |
| 201 | 1197 | 0.4% |
| 203 | 1189 | 0.4% |
| 401 | 1065 | 0.3% |
| 404 | 1054 | 0.3% |
| 802 | 1037 | 0.3% |
| 409 | 1030 | 0.3% |
| Other values (1885) | 306128 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 203860 | |
| 0 | 195295 | |
| 2 | 141199 | |
| 3 | 115462 | |
| 5 | 82184 | |
| 9 | 76832 | 7.2% |
| 4 | 75670 | 7.1% |
| 7 | 62727 | 5.9% |
| 6 | 55667 | 5.2% |
| 8 | 45901 | 4.3% |
| Other values (15) | 10772 | 1.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1054797 | |
| Uppercase Letter | 10772 | 1.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 2011 | |
| M | 1820 | |
| B | 1293 | |
| E | 910 | |
| Z | 898 | |
| D | 722 | 6.7% |
| A | 702 | 6.5% |
| F | 669 | 6.2% |
| C | 354 | 3.3% |
| K | 348 | 3.2% |
| Other values (5) | 1045 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 203860 | |
| 0 | 195295 | |
| 2 | 141199 | |
| 3 | 115462 | |
| 5 | 82184 | |
| 9 | 76832 | 7.3% |
| 4 | 75670 | 7.2% |
| 7 | 62727 | 5.9% |
| 6 | 55667 | 5.3% |
| 8 | 45901 | 4.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1054797 | |
| Latin | 10772 | 1.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| G | 2011 | |
| M | 1820 | |
| B | 1293 | |
| E | 910 | |
| Z | 898 | |
| D | 722 | 6.7% |
| A | 702 | 6.5% |
| F | 669 | 6.2% |
| C | 354 | 3.3% |
| K | 348 | 3.2% |
| Other values (5) | 1045 |
Common
| Value | Count | Frequency (%) |
| 1 | 203860 | |
| 0 | 195295 | |
| 2 | 141199 | |
| 3 | 115462 | |
| 5 | 82184 | |
| 9 | 76832 | 7.3% |
| 4 | 75670 | 7.2% |
| 7 | 62727 | 5.9% |
| 6 | 55667 | 5.3% |
| 8 | 45901 | 4.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1065569 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 203860 | |
| 0 | 195295 | |
| 2 | 141199 | |
| 3 | 115462 | |
| 5 | 82184 | |
| 9 | 76832 | 7.2% |
| 4 | 75670 | 7.1% |
| 7 | 62727 | 5.9% |
| 6 | 55667 | 5.2% |
| 8 | 45901 | 4.3% |
| Other values (15) | 10772 | 1.0% |
ZORGPRODUCT_CD
Real number (ℝ)
| Distinct | 6008 |
|---|---|
| Distinct (%) | 1.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.4149605 × 10⁸ |
| Minimum | 10501002 |
|---|---|
| Maximum | 9.9841808 × 10⁸ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 10501002 |
|---|---|
| 5-th percentile | 28999038 |
| Q1 | 99799062 |
| median | 1.4959903 × 10⁸ |
| Q3 | 9.90004 × 10⁸ |
| 95-th percentile | 9.9051604 × 10⁸ |
| Maximum | 9.9841808 × 10⁸ |
| Range | 9.8791708 × 10⁸ |
| Interquartile range (IQR) | 8.9020494 × 10⁸ |
Descriptive statistics
| Standard deviation | 4.2918281 × 10⁸ |
|---|---|
| Coefficient of variation (CV) | 0.97211019 |
| Kurtosis | -1.7407123 |
| Mean | 4.4149605 × 10⁸ |
| Median Absolute Deviation (MAD) | 1.1960003 × 10⁸ |
| Skewness | 0.46412785 |
| Sum | 1.4034718 × 10¹⁴ |
| Variance | 1.8419788 × 10¹⁷ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 990004009 | 2321 | 0.7% |
| 990004007 | 2283 | 0.7% |
| 990003004 | 2220 | 0.7% |
| 990004006 | 1846 | 0.6% |
| 990356076 | 1688 | 0.5% |
| 990356073 | 1558 | 0.5% |
| 131999228 | 1483 | 0.5% |
| 131999164 | 1469 | 0.5% |
| 990003007 | 1447 | 0.5% |
| 131999194 | 1344 | 0.4% |
| Other values (5998) | 300231 |
| Value | Count | Frequency (%) |
| 10501002 | 9 | |
| 10501003 | 11 | |
| 10501004 | 11 | |
| 10501005 | 11 | |
| 10501007 | 3 | < 0.1% |
| 10501008 | 11 | |
| 10501010 | 11 | |
| 10501011 | 3 | < 0.1% |
| 11101002 | 10 | |
| 11101003 | 11 |
| Value | Count | Frequency (%) |
| 998418081 | 158 | |
| 998418080 | 142 | |
| 998418079 | 38 | < 0.1% |
| 998418077 | 8 | < 0.1% |
| 998418076 | 8 | < 0.1% |
| 998418075 | 6 | < 0.1% |
| 998418074 | 214 | |
| 998418073 | 214 | |
| 998418072 | 8 | < 0.1% |
| 998418071 | 8 | < 0.1% |
AANTAL_PAT_PER_ZPD
Real number (ℝ)
| Distinct | 10020 |
|---|---|
| Distinct (%) | 3.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 512.43038 |
| Minimum | 1 |
|---|---|
| Maximum | 165142 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 14 |
| Q3 | 103 |
| 95-th percentile | 1739 |
| Maximum | 165142 |
| Range | 165141 |
| Interquartile range (IQR) | 100 |
Descriptive statistics
| Standard deviation | 3164.01 |
|---|---|
| Coefficient of variation (CV) | 6.1745169 |
| Kurtosis | 407.0903 |
| Mean | 512.43038 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 16.710391 |
| Sum | 1.6289649 × 10⁸ |
| Variance | 10010959 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 52435 | 16.5% |
| 2 | 25640 | 8.1% |
| 3 | 16727 | 5.3% |
| 4 | 12335 | 3.9% |
| 5 | 9596 | 3.0% |
| 6 | 8151 | 2.6% |
| 7 | 6726 | 2.1% |
| 8 | 5793 | 1.8% |
| 9 | 5221 | 1.6% |
| 10 | 4640 | 1.5% |
| Other values (10010) | 170626 |
| Value | Count | Frequency (%) |
| 1 | 52435 | |
| 2 | 25640 | |
| 3 | 16727 | 5.3% |
| 4 | 12335 | 3.9% |
| 5 | 9596 | 3.0% |
| 6 | 8151 | 2.6% |
| 7 | 6726 | 2.1% |
| 8 | 5793 | 1.8% |
| 9 | 5221 | 1.6% |
| 10 | 4640 | 1.5% |
| Value | Count | Frequency (%) |
| 165142 | 1 | |
| 155884 | 1 | |
| 155025 | 1 | |
| 154269 | 1 | |
| 154184 | 1 | |
| 144724 | 1 | |
| 118395 | 1 | |
| 115938 | 1 | |
| 110520 | 1 | |
| 109675 | 1 |
AANTAL_SUBTRAJECT_PER_ZPD
Real number (ℝ)
| Distinct | 10729 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 604.99074 |
| Minimum | 1 |
|---|---|
| Maximum | 239709 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 15 |
| Q3 | 113 |
| 95-th percentile | 1983 |
| Maximum | 239709 |
| Range | 239708 |
| Interquartile range (IQR) | 110 |
Descriptive statistics
| Standard deviation | 4067.9191 |
|---|---|
| Coefficient of variation (CV) | 6.7239362 |
| Kurtosis | 726.84752 |
| Mean | 604.99074 |
| Median Absolute Deviation (MAD) | 14 |
| Skewness | 21.357914 |
| Sum | 1.9232051 × 10⁸ |
| Variance | 16547966 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 50514 | 15.9% |
| 2 | 25188 | 7.9% |
| 3 | 16575 | 5.2% |
| 4 | 12118 | 3.8% |
| 5 | 9513 | 3.0% |
| 6 | 8128 | 2.6% |
| 7 | 6698 | 2.1% |
| 8 | 5727 | 1.8% |
| 9 | 5179 | 1.6% |
| 10 | 4634 | 1.5% |
| Other values (10719) | 173616 |
| Value | Count | Frequency (%) |
| 1 | 50514 | |
| 2 | 25188 | |
| 3 | 16575 | 5.2% |
| 4 | 12118 | 3.8% |
| 5 | 9513 | 3.0% |
| 6 | 8128 | 2.6% |
| 7 | 6698 | 2.1% |
| 8 | 5727 | 1.8% |
| 9 | 5179 | 1.6% |
| 10 | 4634 | 1.5% |
| Value | Count | Frequency (%) |
| 239709 | 1 | |
| 232256 | 1 | |
| 231983 | 1 | |
| 230923 | 1 | |
| 227940 | 1 | |
| 227432 | 1 | |
| 223970 | 1 | |
| 222465 | 1 | |
| 218449 | 1 | |
| 215070 | 1 |
AANTAL_PAT_PER_DIAG
Real number (ℝ)
| Distinct | 8921 |
|---|---|
| Distinct (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7681.0088 |
| Minimum | 1 |
|---|---|
| Maximum | 227967 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 41 |
| Q1 | 407 |
| median | 1722 |
| Q3 | 6359 |
| 95-th percentile | 36582 |
| Maximum | 227967 |
| Range | 227966 |
| Interquartile range (IQR) | 5952 |
Descriptive statistics
| Standard deviation | 17806.728 |
|---|---|
| Coefficient of variation (CV) | 2.3182798 |
| Kurtosis | 34.399128 |
| Mean | 7681.0088 |
| Median Absolute Deviation (MAD) | 1566 |
| Skewness | 5.0786242 |
| Sum | 2.4417159 × 10⁹ |
| Variance | 3.1707956 × 10⁸ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 21 | 532 | 0.2% |
| 8 | 498 | 0.2% |
| 17 | 475 | 0.1% |
| 9 | 467 | 0.1% |
| 12 | 457 | 0.1% |
| 25 | 453 | 0.1% |
| 26 | 452 | 0.1% |
| 14 | 427 | 0.1% |
| 32 | 421 | 0.1% |
| 11 | 420 | 0.1% |
| Other values (8911) | 313288 |
| Value | Count | Frequency (%) |
| 1 | 363 | |
| 2 | 408 | |
| 3 | 391 | |
| 4 | 401 | |
| 5 | 365 | |
| 6 | 403 | |
| 7 | 378 | |
| 8 | 498 | |
| 9 | 467 | |
| 10 | 376 |
| Value | Count | Frequency (%) |
| 227967 | 23 | |
| 221635 | 23 | |
| 217854 | 24 | |
| 214511 | 17 | |
| 213535 | 25 | |
| 211593 | 17 | |
| 210434 | 19 | |
| 205348 | 17 | |
| 200603 | 16 | |
| 198527 | 20 |
AANTAL_SUBTRAJECT_PER_DIAG
Real number (ℝ)
| Distinct | 9975 |
|---|---|
| Distinct (%) | 3.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11050.265 |
| Minimum | 1 |
|---|---|
| Maximum | 369837 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 52 |
| Q1 | 539 |
| median | 2384 |
| Q3 | 9076 |
| 95-th percentile | 51555 |
| Maximum | 369837 |
| Range | 369836 |
| Interquartile range (IQR) | 8537 |
Descriptive statistics
| Standard deviation | 26530.735 |
|---|---|
| Coefficient of variation (CV) | 2.4009139 |
| Kurtosis | 38.162106 |
| Mean | 11050.265 |
| Median Absolute Deviation (MAD) | 2187 |
| Skewness | 5.3355708 |
| Sum | 3.5127688 × 10⁹ |
| Variance | 7.0387991 × 10⁸ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 11 | 403 | 0.1% |
| 17 | 384 | 0.1% |
| 39 | 364 | 0.1% |
| 13 | 363 | 0.1% |
| 25 | 363 | 0.1% |
| 52 | 360 | 0.1% |
| 18 | 357 | 0.1% |
| 23 | 353 | 0.1% |
| 33 | 350 | 0.1% |
| 5 | 349 | 0.1% |
| Other values (9965) | 314244 |
| Value | Count | Frequency (%) |
| 1 | 288 | |
| 2 | 317 | |
| 3 | 320 | |
| 4 | 307 | |
| 5 | 349 | |
| 6 | 338 | |
| 7 | 317 | |
| 8 | 317 | |
| 9 | 267 | |
| 10 | 316 |
| Value | Count | Frequency (%) |
| 369837 | 23 | |
| 348523 | 25 | |
| 347198 | 23 | |
| 343084 | 24 | |
| 341692 | 19 | |
| 323791 | 20 | |
| 315781 | 17 | |
| 310778 | 17 | |
| 298646 | 17 | |
| 289045 | 16 |
AANTAL_PAT_PER_SPC
Real number (ℝ)
| Distinct | 297 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 669808.92 |
| Minimum | 1376 |
|---|---|
| Maximum | 1487642 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 1376 |
|---|---|
| 5-th percentile | 42576 |
| Q1 | 287349 |
| median | 746974 |
| Q3 | 1026703 |
| 95-th percentile | 1340856 |
| Maximum | 1487642 |
| Range | 1486266 |
| Interquartile range (IQR) | 739354 |
Descriptive statistics
| Standard deviation | 413120.06 |
|---|---|
| Coefficient of variation (CV) | 0.61677301 |
| Kurtosis | -1.1197061 |
| Mean | 669808.92 |
| Median Absolute Deviation (MAD) | 314548 |
| Skewness | 0.019103111 |
| Sum | 2.1292556 × 10¹¹ |
| Variance | 1.7066818 × 10¹¹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 880942 | 5102 | 1.6% |
| 874126 | 4354 | 1.4% |
| 843981 | 4347 | 1.4% |
| 894337 | 4333 | 1.4% |
| 880504 | 4273 | 1.3% |
| 897712 | 4212 | 1.3% |
| 764815 | 4088 | 1.3% |
| 776544 | 3994 | 1.3% |
| 1081598 | 3890 | 1.2% |
| 1100675 | 3866 | 1.2% |
| Other values (287) | 275431 |
| Value | Count | Frequency (%) |
| 1376 | 117 | < 0.1% |
| 1610 | 130 | < 0.1% |
| 1702 | 138 | < 0.1% |
| 1920 | 131 | < 0.1% |
| 2255 | 183 | |
| 2495 | 173 | 0.1% |
| 6806 | 380 | |
| 8364 | 74 | < 0.1% |
| 11100 | 366 | |
| 11432 | 438 |
| Value | Count | Frequency (%) |
| 1487642 | 2975 | |
| 1450406 | 3048 | |
| 1421746 | 3564 | |
| 1344568 | 3543 | |
| 1340856 | 3441 | |
| 1332481 | 3545 | |
| 1316690 | 3463 | |
| 1282965 | 3576 | |
| 1265249 | 1177 | 0.4% |
| 1262541 | 1201 | 0.4% |
AANTAL_SUBTRAJECT_PER_SPC
Real number (ℝ)
| Distinct | 297 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1079155.5 |
| Minimum | 1578 |
|---|---|
| Maximum | 2666528 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 1578 |
|---|---|
| 5-th percentile | 46629 |
| Q1 | 406625 |
| median | 1078699 |
| Q3 | 1728191 |
| 95-th percentile | 2550437 |
| Maximum | 2666528 |
| Range | 2664950 |
| Interquartile range (IQR) | 1321566 |
Descriptive statistics
| Standard deviation | 739599.49 |
|---|---|
| Coefficient of variation (CV) | 0.68535023 |
| Kurtosis | -0.79326322 |
| Mean | 1079155.5 |
| Median Absolute Deviation (MAD) | 649492 |
| Skewness | 0.37588312 |
| Sum | 3.4305275 × 10¹¹ |
| Variance | 5.470074 × 10¹¹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1211792 | 5102 | 1.6% |
| 1281527 | 4354 | 1.4% |
| 1216258 | 4347 | 1.4% |
| 1315603 | 4333 | 1.4% |
| 1300486 | 4273 | 1.3% |
| 1341872 | 4212 | 1.3% |
| 1155341 | 4088 | 1.3% |
| 1158417 | 3994 | 1.3% |
| 2550437 | 3890 | 1.2% |
| 2666528 | 3866 | 1.2% |
| Other values (287) | 275431 |
| Value | Count | Frequency (%) |
| 1578 | 117 | < 0.1% |
| 1861 | 130 | < 0.1% |
| 1962 | 138 | < 0.1% |
| 2195 | 131 | < 0.1% |
| 2816 | 173 | 0.1% |
| 3005 | 183 | |
| 7385 | 380 | |
| 8889 | 74 | < 0.1% |
| 12521 | 366 | |
| 12982 | 438 |
| Value | Count | Frequency (%) |
| 2666528 | 3866 | |
| 2620514 | 3787 | |
| 2595737 | 3844 | |
| 2564210 | 3781 | |
| 2550437 | 3890 | |
| 2482063 | 3851 | |
| 2179417 | 3757 | |
| 2062495 | 3810 | |
| 2052308 | 1168 | 0.4% |
| 1990249 | 1167 | 0.4% |
GEMIDDELDE_VERKOOPPRIJS
Real number (ℝ)
| Distinct | 3494 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 48964 |
| Missing (%) | 15.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3556.834 |
| Minimum | 70 |
|---|---|
| Maximum | 287220 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 70 |
|---|---|
| 5-th percentile | 140 |
| Q1 | 475 |
| median | 1245 |
| Q3 | 4135 |
| 95-th percentile | 13425 |
| Maximum | 287220 |
| Range | 287150 |
| Interquartile range (IQR) | 3660 |
Descriptive statistics
| Standard deviation | 6515.0129 |
|---|---|
| Coefficient of variation (CV) | 1.8316888 |
| Kurtosis | 148.9049 |
| Mean | 3556.834 |
| Median Absolute Deviation (MAD) | 1015 |
| Skewness | 7.2718965 |
| Sum | 9.5652513 × 10⁸ |
| Variance | 42445393 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 160 | 1993 | 0.6% |
| 105 | 1919 | 0.6% |
| 110 | 1790 | 0.6% |
| 180 | 1571 | 0.5% |
| 185 | 1482 | 0.5% |
| 300 | 1377 | 0.4% |
| 175 | 1373 | 0.4% |
| 120 | 1362 | 0.4% |
| 145 | 1359 | 0.4% |
| 125 | 1234 | 0.4% |
| Other values (3484) | 253466 | |
| (Missing) | 48964 | 15.4% |
| Value | Count | Frequency (%) |
| 70 | 226 | 0.1% |
| 75 | 75 | < 0.1% |
| 80 | 362 | 0.1% |
| 85 | 919 | |
| 90 | 665 | 0.2% |
| 95 | 720 | 0.2% |
| 100 | 920 | |
| 105 | 1919 | |
| 110 | 1790 | |
| 115 | 978 |
| Value | Count | Frequency (%) |
| 287220 | 8 | |
| 148910 | 3 | < 0.1% |
| 142835 | 4 | |
| 122155 | 4 | |
| 116765 | 3 | < 0.1% |
| 109725 | 7 | |
| 108570 | 7 | |
| 107655 | 4 | |
| 101270 | 8 | |
| 96880 | 5 |
Auto
The auto setting is an interpretable pairwise column metric with the following mapping:
- Variable_type-Variable_type : Method, Range
- Categorical-Categorical : Cramer's V, [0,1]
- Numerical-Categorical : Cramer's V, [0,1] (using a discretized numerical column)
- Numerical-Numerical : Spearman's ρ, [-1,1]
This configuration uses the recommended metric for each pair of columns.
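The mapping above can be read as a small lookup on the pair of column types. The following is only an illustrative sketch of that lookup. The function name `auto_metric` and the type labels are assumptions made for this example and are not part of the profiling library's API.

```python
def auto_metric(type_a, type_b):
    """Return (method, (low, high)) for a pair of column types.

    Hypothetical helper that mirrors the mapping listed above.
    """
    kinds = {type_a, type_b}
    if not kinds <= {"Numerical", "Categorical"}:
        raise ValueError(f"unsupported column types: {type_a}, {type_b}")
    if kinds == {"Numerical"}:
        # Numerical-Numerical pairs use Spearman's rank correlation.
        return "Spearman's ρ", (-1, 1)
    # Any pair involving a categorical column uses Cramer's V;
    # a numerical partner would first be discretized.
    return "Cramer's V", (0, 1)


print(auto_metric("Numerical", "Numerical"))
print(auto_metric("Numerical", "Categorical"))
print(auto_metric("Categorical", "Categorical"))
```

Because the two methods have different ranges, values in an "auto" correlation matrix are not all on the same scale: only Numerical-Numerical cells can be negative.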
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and 1 indicating total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
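As a minimal sketch of that recipe, the code below ranks both variables (ties receive the average of their positions, which is an assumption about tie handling) and then applies the covariance-over-standard-deviations formula to the ranks. The helper names and the sample data are illustrative only.

```python
from statistics import fmean


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's ρ: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3: monotonic but not linear

print(spearman(x, y))  # the ranks agree perfectly
print(pearson(x, y))   # the raw values are not on a straight line
```

The example shows why ρ suits skewed count columns: a relationship that is monotonic but curved still scores (numerically) 1 on ranks, while r on the raw values stays below 1.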
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and 1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
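The calculation and the invariance property can be checked with a few lines of code. This is a sketch using hypothetical sample data. The shift and scale factors are arbitrary, and the scale factors are kept positive so that the sign of r is preserved.

```python
from statistics import fmean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n factors of covariance and standard deviations cancel out.
    return cov / (sx * sy)


# Hypothetical, roughly linear data.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

r = pearson_r(x, y)

# Shift and rescale each variable separately; r should not change.
r_transformed = pearson_r([10 * a + 5 for a in x], [0.5 * b - 3 for b in y])

print(r)
print(r_transformed)
```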
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and 1 indicating total positive correlation.
To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
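The pair-counting definition translates almost literally into code. The sketch below implements that plain form (often called τ-a), in which tied pairs count as neither concordant nor discordant. Statistical libraries commonly default to a tie-corrected variant, so results can differ on data with ties. The sample data is hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six

print(kendall_tau(x, y))
print(kendall_tau(x, x))
print(kendall_tau(x, x[::-1]))
```

This direct version compares every pair, so it scales quadratically with the number of observations and is meant for illustration rather than for a dataset of this size.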
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for it.
First rows
| | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 2022-11-21 | 2022-11-01 | 2018-01-01 | 329 | 14 | 990029010 | 4 | 4 | 97 | 97 | 21982 | 24170 | 1345.0 |
| 1 | 1.0 | 2022-11-21 | 2022-11-01 | 2018-01-01 | 329 | 07 | 990029012 | 409 | 416 | 1439 | 1500 | 21982 | 24170 | 1040.0 |
| 2 | 1.0 | 2022-11-21 | 2022-11-01 | 2018-01-01 | 329 | 15 | 990029010 | 181 | 189 | 1029 | 1075 | 21982 | 24170 | 1345.0 |
| 3 | 1.0 | 2022-11-21 | 2022-11-01 | 2018-01-01 | 329 | 09 | 990029011 | 16 | 16 | 24 | 25 | 21982 | 24170 | 545.0 |
| 4 | 1.0 | 2022-11-21 | 2022-11-01 | 2018-01-01 | 329 | 10 | 990029012 | 22 | 23 | 93 | 99 | 21982 | 24170 | 1040.0 |
| 5 | 1.0 | 2022-11-21 | 2022-11-01 | 2018-01-01 | 329 | 14 | 990029011 | 61 | 61 | 97 | 97 | 21982 | 24170 | 545.0 |
| 6 | 1.0 | 2022-11-21 | 2022-11-01 | 2018-01-01 | 329 | 10 | 990029002 | 14 | 14 | 93 | 99 | 21982 | 24170 | 205.0 |
| 7 | 1.0 | 2022-11-21 | 2022-11-01 | 2018-01-01 | 329 | 12 | 990029012 | 34 | 34 | 114 | 120 | 21982 | 24170 | 1040.0 |
| 8 | 1.0 | 2022-11-21 | 2022-11-01 | 2018-01-01 | 329 | 03 | 990029010 | 179 | 183 | 847 | 863 | 21982 | 24170 | 1345.0 |
| 9 | 1.0 | 2022-11-21 | 2022-11-01 | 2018-01-01 | 329 | 10 | 990029011 | 49 | 52 | 93 | 99 | 21982 | 24170 | 545.0 |
Last rows
| | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 317880 | 1.0 | 2022-11-21 | 2022-11-01 | 2016-01-01 | 303 | 293 | 990356059 | 2 | 2 | 1713 | 2455 | 1332481 | 1832104 | 2235.0 |
| 317881 | 1.0 | 2022-11-21 | 2022-11-01 | 2016-01-01 | 303 | 206 | 199299053 | 1 | 1 | 1184 | 1309 | 1332481 | 1832104 | 4580.0 |
| 317882 | 1.0 | 2022-11-21 | 2022-11-01 | 2016-01-01 | 303 | 249 | 199299074 | 1 | 1 | 628 | 794 | 1332481 | 1832104 | 10705.0 |
| 317883 | 1.0 | 2022-11-21 | 2022-11-01 | 2012-01-01 | 303 | 434 | 99699054 | 1 | 1 | 902 | 1012 | 1487642 | 1939484 | 3505.0 |
| 317884 | 1.0 | 2022-11-21 | 2022-11-01 | 2012-01-01 | 303 | 280 | 199299089 | 1 | 1 | 111244 | 114318 | 1487642 | 1939484 | 2990.0 |
| 317885 | 1.0 | 2022-11-21 | 2022-11-01 | 2014-01-01 | 313 | 523 | 131999214 | 1 | 1 | 237 | 267 | 1037701 | 2062495 | 340.0 |
| 317886 | 1.0 | 2022-11-21 | 2022-11-01 | 2012-01-01 | 303 | 348 | 990003007 | 1 | 1 | 2384 | 3200 | 1487642 | 1939484 | 100.0 |
| 317887 | 1.0 | 2022-11-21 | 2022-11-01 | 2014-01-01 | 313 | 904 | 990003004 | 2 | 2 | 3279 | 13270 | 1037701 | 2062495 | 105.0 |
| 317888 | 1.0 | 2022-11-21 | 2022-11-01 | 2014-01-01 | 313 | 919 | 990003004 | 2 | 2 | 1752 | 1907 | 1037701 | 2062495 | 105.0 |
| 317889 | 1.0 | 2022-11-21 | 2022-11-01 | 2016-01-01 | 303 | 410 | 192001076 | 1 | 1 | 580 | 690 | 1332481 | 1832104 | NaN |